Constrained Contextual Bandit Learning for Adaptive Radar Waveform Selection
Authors
Abstract
A sequential decision process in which an adaptive radar system repeatedly interacts with a finite-state target channel is studied. The radar is capable of passively sensing the spectrum at regular intervals, which provides side information for the waveform selection process. The radar transmitter uses the sequence of spectrum observations, as well as feedback from a collocated receiver, to select waveforms that accurately estimate target parameters. It is shown that the waveform selection problem can be effectively addressed using a linear contextual bandit formulation in a manner that is both computationally feasible and sample efficient. Stochastic and adversarial linear contextual bandit models are introduced, allowing the radar to achieve effective performance in broad classes of physical environments. Simulations in a radar-communication coexistence scenario, as well as in an adversarial radar-jammer scenario, demonstrate that the proposed formulation provides a substantial improvement in target detection performance when Thompson sampling and EXP3 algorithms are used to drive the waveform selection process. Further, it is shown that the harmful impacts of pulse-agile behavior on coherently processed radar data can be mitigated by adopting a time-varying constraint on the radar's waveform catalog.
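To make the linear contextual bandit idea in the abstract concrete, the following is a minimal sketch of Thompson sampling with a per-waveform Gaussian linear reward model. It is not the authors' implementation. The class name `LinearThompsonSampler`, the two-band one-hot context, the reward values in `true_theta`, and the noise level are all hypothetical choices made for illustration. The `allowed` argument stands in for a time-varying constraint on the waveform catalog.

```python
import numpy as np


class LinearThompsonSampler:
    """Per-waveform Bayesian linear reward model (illustrative, not from the paper)."""

    def __init__(self, n_waveforms, context_dim, noise_var=0.01, prior_var=1.0, seed=0):
        self.rng = np.random.default_rng(seed)
        self.noise_var = noise_var
        # Posterior precision matrix per waveform, starting from the Gaussian prior.
        self.precision = np.stack(
            [np.eye(context_dim) / prior_var for _ in range(n_waveforms)]
        )
        # Noise-scaled sum of reward-weighted contexts per waveform.
        self.b = np.zeros((n_waveforms, context_dim))

    def select(self, context, allowed=None):
        """Sample one parameter vector per permitted waveform and pick the best."""
        arms = range(len(self.b)) if allowed is None else allowed
        best_arm, best_score = None, -np.inf
        for arm in arms:
            cov = np.linalg.inv(self.precision[arm])
            cov = (cov + cov.T) / 2.0  # symmetrize against round-off
            mean = cov @ self.b[arm]
            theta = self.rng.multivariate_normal(mean, cov)
            score = float(theta @ context)
            if score > best_score:
                best_arm, best_score = arm, score
        return best_arm

    def update(self, arm, context, reward):
        """Conjugate Gaussian update for the waveform that was transmitted."""
        self.precision[arm] += np.outer(context, context) / self.noise_var
        self.b[arm] += reward * context / self.noise_var


def simulate(n_rounds=300, seed=0):
    """Toy environment (assumed): an interferer occupies one of two bands."""
    env_rng = np.random.default_rng(seed)
    # Hypothetical expected reward of each waveform under each one-hot context.
    true_theta = np.array([[0.1, 0.9], [0.9, 0.1]])
    sampler = LinearThompsonSampler(n_waveforms=2, context_dim=2, seed=seed + 1)
    hits = []
    for _ in range(n_rounds):
        context = np.eye(2)[env_rng.integers(2)]  # sensed spectrum state
        arm = sampler.select(context)
        reward = true_theta[arm] @ context + 0.1 * env_rng.standard_normal()
        sampler.update(arm, context, reward)
        hits.append(arm == int(np.argmax(true_theta @ context)))
    return hits
```

Calling `simulate()` returns a per-round record of whether the selected waveform was the best one for the sensed context. Passing a restricted list such as `allowed=[0]` to `select` for a block of pulses is one simple way to imitate a catalog constraint that keeps the waveform fixed over a coherent processing interval.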
Similar resources
Q-Learning-Based Adaptive Waveform Selection in Cognitive Radar
Cognitive radar is a radar system framework recently proposed by Simon Haykin. Adaptive waveform selection is an important problem for the intelligent transmitter in cognitive radar. In this paper, adaptive waveform selection is modeled as a stochastic dynamic programming problem, and Q-learning is used to solve it. Q-learning can solve the problems that we do not know the explici...
Adaptive Representation Selection in Contextual Bandit with Unlabeled History
We consider an extension of the contextual bandit setting, motivated by several practical applications, where an unlabeled history of contexts can become available for pre-training before the online decision-making begins. We propose an approach for improving the performance of contextual bandits in such a setting via adaptive, dynamic representation learning, which combines offline pre-training o...
Research on Adaptive Waveform Selection Algorithm in Cognitive Radar
Cognitive radar is a radar system framework recently proposed by Simon Haykin. Adaptive waveform selection is an important problem for the intelligent transmitter in cognitive radar. In this paper, adaptive waveform selection is modeled as a stochastic dynamic programming problem. Backward dynamic programming, temporal difference learning, and Q-learning are then used to solve this ...
Contextual Bandit Algorithms with Supervised Learning Guarantees
We address the problem of competing with any large set of N policies in the nonstochastic bandit setting, where the learner must repeatedly select among K actions but observes only the reward of the chosen action. We present a modification of the Exp4 algorithm of Auer et al. [2], called Exp4.P, which with high probability incurs regret at most O(√(KT ln N)). Such a bound does not hold for Exp4 ...
Contextual Bandit Learning with Predictable Rewards
Contextual bandit learning is a reinforcement learning problem where the learner repeatedly receives a set of features (context), takes an action and receives a reward based on the action and context. We consider this problem under a realizability assumption: there exists a function in a (known) function class, always capable of predicting the expected reward, given the action and context. Unde...
Journal
Journal title: IEEE Transactions on Aerospace and Electronic Systems
Year: 2022
ISSN: 1557-9603, 0018-9251, 2371-9877
DOI: https://doi.org/10.1109/taes.2021.3109110